
The consensus coding sequence (CCDS) project: Identifying a common protein-coding gene set for the human and mouse genomes



Abstract

Effective use of the human and mouse genomes requires reliable identification of genes and their products. Although multiple public resources provide annotation, they use different methods, which can result in similar but not identical representations of genes, transcripts, and proteins. The collaborative consensus coding sequence (CCDS) project tracks identical protein annotations on the reference mouse and human genomes with a stable identifier (CCDS ID) and ensures that they are consistently represented on the NCBI, Ensembl, and UCSC Genome Browsers. Importantly, the project coordinates manual review of protein annotations that are inconsistent between sites, as well as annotations for which new evidence suggests a revision is needed, to progressively converge on a complete protein-coding set for the human and mouse reference genomes while maintaining a high standard of reliability and biological accuracy. To date, the project has identified 20,159 human and 17,707 mouse consensus coding regions from 17,052 human and 16,893 mouse genes. Three evaluation methods indicate that the entries in the CCDS set are highly likely to represent real proteins, more so than annotations from contributing groups not included in CCDS. The CCDS database thus centralizes the function of identifying well-supported, identically annotated protein-coding regions.
